@nilseling
@NilsEling

1 Data and code availability

To follow this tutorial, please visit https://github.com/BodenmillerGroup/demos/tree/main/docs. The compiled .html file of this workshop is hosted at: https://bodenmillergroup.github.io/demos.

We will need to install the following packages for the workshop:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install(c("imcRtools", "tidyverse", "patchwork",
                       "ggplot2", "viridis", "pheatmap", "scales")))

To reproduce the analysis, clone the repository:

git clone https://github.com/BodenmillerGroup/demos.git

and open the EuroBioc2022_workshop.Rmd file in the docs folder.

2 Introduction

Highly multiplexed imaging enables the simultaneous detection of tens of biological molecules (e.g. proteins, RNA; also referred to as “markers”) in their spatial tissue context. Recently established multiplexed imaging technologies rely on cyclic staining with immunofluorescently-tagged antibodies (Lin et al. 2018; Gut, Herrmann, and Pelkmans 2018), or the use of oligonucleotide-tagged (Saka et al. 2019; Goltsev et al. 2018) or metal-tagged antibodies (Giesen et al. 2014; Angelo et al. 2014), among others. Across technologies, the acquired data are commonly stored as multi-channel images, where each pixel encodes the abundance of all acquired markers at a specific position in the tissue. After data acquisition, bioimage processing and segmentation are conducted to extract data for downstream analysis. When performing end-to-end multiplexed image analysis, the user is often faced with a diverse set of computational tools and complex analysis scripts. We developed an interoperabale, modularized computational workflow to process and analyze multiplexed imaging data (Figure 1). The steinbock framework facilitates multi-channel image processing including raw data pre-processing, image segmentation and feature extraction. Data generated by steinbock can be directly read by the imcRtools R/Bioconductor package for data visualization and spatial analysis (Figure 1). The cytomapper package support image handling and composite as well as segmentation mask visualization.

Figure 1: Overview of the multiplexed image processing and analysis workflow. Raw image data can be interactively visualized using napari plugins such as napari-imc for IMC, to assess data quality and for exploratory visualization. The steinbock framework performs image pre-processing, cell segmentation and single-cell data extraction using established approaches and standardized file formats. Data can be imported into R using the imcRtools package, which further supports spatial visualization and analysis. Storing the data in a SingleCellExperiment or SpatialExperiment object, imcRtools integrates with a variety of data analysis tools of the Bioconductor project such as cytomapper (Eling et al. 2020). Alternatively, steinbock exports data to the anndata format for analysis in Python, e.g. using squidpy.

The workshop focuses on the spatial analysis of single-cell data obtained from multiplexed imaging technologies. While the concepts presented here can be applied in a technology-agnostic way, for demonstration purposes, we present data from Imaging Mass Cytometry (IMC), which relies on tissue staining with metal-labelled antibodies to jointly measure the spatial distribution of up to 40 proteins or RNA at 1μm resolution (Giesen et al. 2014; Schulz et al. 2018).

The workshop highlights the following analysis steps using the imcRtools R/Bioconductor package:

  1. Construction and visualization of spatial object graphs
  2. Cellular neighborhood detection
  3. Spatial context detection
  4. Patch detection
  5. Cell type/cell type interaction analysis

The analysis approaches presented here were taken from the IMC data analysis book. The book provides more detailed information on the technical underpinnings of the analysis.

More information can also be found in our preprint

3 Download example data

To highlight spatial data analysis approaches, we provide example data that were acquired as part of the Integrated iMMUnoprofiling of large adaptive CANcer patient cohorts projects (immucan.eu). The spatially annotated, single-cell data can be obtained via:

download.file("https://zenodo.org/record/6810879/files/spe.rds",
              "../data/spe.rds")

(spe <- readRDS("../data/spe.rds"))
## class: SpatialExperiment 
## dim: 40 46825 
## metadata(4): color_vectors cluster_codes SOM_codes delta_area
## assays(2): counts exprs
## rownames(40): MPO HistoneH3 ... DNA1 DNA2
## rowData names(15): channel name ... marker_class used_for_clustering
## colnames(46825): Patient1_001_1 Patient1_001_2 ... Patient4_008_2772
##   Patient4_008_2773
## colData names(20): sample_id ObjectNumber ... cell_labels celltype
## reducedDimNames(8): UMAP TSNE ... seurat UMAP_seurat
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : Pos_X Pos_Y
## imgData names(1): sample_id

This SpatialExperiment object contains single-cell data from 4 patients and 14 IMC images. Each column entry represents a single-cell and each row entry represents a single marker. We quantify protein abundance as the mean pixel intensity per marker and cell. These abundance measures are stored in the counts assay and their asinh-transformed values are stored in the exprs assay.

unique(spe$patient_id)
## [1] "Patient1" "Patient2" "Patient3" "Patient4"
unique(spe$sample_id)
##  [1] "Patient1_001" "Patient1_002" "Patient1_003" "Patient2_001" "Patient2_002"
##  [6] "Patient2_003" "Patient2_004" "Patient3_001" "Patient3_002" "Patient3_003"
## [11] "Patient4_005" "Patient4_006" "Patient4_007" "Patient4_008"
unique(spe$indication)
## [1] "SCCHN" "BCC"   "NSCLC" "CRC"
unique(spe$celltype)
##  [1] "Tumor"       "Stroma"      "Myeloid"     "Plasma_cell" "Treg"       
##  [6] "CD8"         "CD4"         "undefined"   "BnTcell"     "Bcell"      
## [11] "Neutrophil"
counts(spe)[1:5,1:5]
##           Patient1_001_1 Patient1_001_2 Patient1_001_3 Patient1_001_4
## MPO            0.6273888      0.4500000      0.5286462      1.0191420
## HistoneH3      3.4095933     13.0472305      2.5274191      9.5967590
## SMA            0.0000000      0.8347940      0.0000000      0.5937160
## CD16           2.1288451      2.7879388      2.1463270     18.4293193
## CD38           0.1471510      0.5566445      0.9559316      0.5434799
##           Patient1_001_5
## MPO            0.4000000
## HistoneH3      2.8898319
## SMA            0.0000000
## CD16           0.8185134
## CD38           0.3812283
assay(spe, "exprs")[1:5,1:5]
##           Patient1_001_1 Patient1_001_2 Patient1_001_3 Patient1_001_4
## MPO            0.5921683      0.4360497      0.5066859      0.8948444
## HistoneH3      1.9405827      3.2631884      1.6573665      2.9572761
## SMA            0.0000000      0.7596078      0.0000000      0.5634290
## CD16           1.4998154      1.7491645      1.5072232      3.6078253
## CD38           0.1466251      0.5312943      0.8498667      0.5197596
##           Patient1_001_5
## MPO            0.3900353
## HistoneH3      1.7830203
## SMA            0.0000000
## CD16           0.7470596
## CD38           0.3725504

The spatial location of each cell is stored in the spatialCoords slot.

head(spatialCoords(spe))
##      Pos_X     Pos_Y
## 1 468.8182 0.3636364
## 2 516.4500 0.4000000
## 3 587.5000 0.5000000
## 4 192.2632 0.8947368
## 5 231.5000 0.4000000
## 6 270.8000 0.8500000

During image processing, the steinbock framework extracts spatial object graphs indicating cells in close physical proximity in form of an edgelist. This interaction graph is stored in:

colPair(spe, type = "neighborhood")
## SelfHits object with 246584 hits and 0 metadata columns:
##                 from        to
##            <integer> <integer>
##        [1]         1        27
##        [2]         1        54
##        [3]         2        10
##        [4]         2        43
##        [5]         3        16
##        ...       ...       ...
##   [246580]     46824     46802
##   [246581]     46825     46762
##   [246582]     46825     46787
##   [246583]     46825     46796
##   [246584]     46825     46820
##   -------
##   nnode: 46825

We can also visualize all cells.

library(imcRtools)
library(ggplot2)

plotSpatial(spe, 
            node_color_by = "celltype", 
            img_id = "sample_id", 
            node_size_fix = 0.5) + 
    scale_color_manual(values = metadata(spe)$color_vectors$celltype)

4 Spatial interaction graphs

Many spatial analysis approaches either compare the observed versus expected number of cells around a given cell type (point process) or utilize interaction graphs (spatial object graphs) to estimate clustering or interaction frequencies between cell types.

The steinbock framework allows the construction of these spatial graphs. During image processing, we have constructed a spatial graph by expanding the individual cell masks by 4 pixels.

The imcRtools package further allows the ad hoc construction of spatial graphs directly using a SpatialExperiment or SingleCellExperiment object while considering the spatial location (centroids) of individual cells. The buildSpatialGraph function allows constructing spatial graphs by detecting the k-nearest neighbors in 2D (knn), by detecting all cells within a given distance to the center cell (expansion) and by Delaunay triangulation (delaunay).

When constructing a knn graph, the number of neighbors (k) needs to be set and (optionally) the maximum distance to consider (max_dist) can be specified. When constructing a graph via expansion, the distance to expand (threshold) needs to be provided. For graphs constructed via Delaunay triangulation, the max_dist parameter can be set to avoid unusually large connections at the edge of the image.

spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", k = 20)
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "expansion", threshold = 20)
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "delaunay", max_dist = 50)

The spatial graphs are stored in colPair(spe, name) slots. These slots store SelfHits objects representing edge lists in which the first column indicates the index of the “from” cell and the second column the index of the “to” cell. Each edge list is newly constructed when subsetting the object.

colPairNames(spe)
## [1] "neighborhood"                "knn_interaction_graph"      
## [3] "expansion_interaction_graph" "delaunay_interaction_graph"

Here, colPair(spe, "neighborhood") stores the spatial graph constructed by steinbock, colPair(spe, "knn_interaction_graph") stores the knn spatial graph, colPair(spe, "expansion_interaction_graph") stores the expansion graph and colPair(spe, "delaunay_interaction_graph") stores the graph constructed by Delaunay triangulation.

5 Spatial visualization

Here, we introduce the plotSpatial function of the imcRtools package to visualize the cells’ centroids and cell-cell interactions as spatial graphs.

In the following example, we select one image for visualization purposes. Here, each dot (node) represents a cell and edges are drawn between cells in close physical proximity as detected by steinbock or the buildSpatialGraph function. Nodes are variably colored based on the cell type and edges are colored in grey.

library(viridis)

# steinbock interaction graph 
plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
            node_color_by = "celltype", 
            img_id = "sample_id", 
            draw_edges = TRUE, 
            colPairName = "neighborhood", 
            nodes_first = FALSE, 
            edge_color_fix = "grey") + 
    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
    ggtitle("steinbock interaction graph")

# knn interaction graph 
plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
            node_color_by = "celltype", 
            img_id = "sample_id", 
            draw_edges = TRUE, 
            colPairName = "knn_interaction_graph", 
            nodes_first = FALSE,
            edge_color_fix = "grey") + 
    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
    ggtitle("knn interaction graph")

# expansion interaction graph 
plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
            node_color_by = "celltype", 
            img_id = "sample_id", 
            draw_edges = TRUE, 
            colPairName = "expansion_interaction_graph", 
            nodes_first = FALSE, 
            directed = FALSE,
            edge_color_fix = "grey") + 
    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
    ggtitle("expansion interaction graph")

# delaunay interaction graph 
plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
            node_color_by = "celltype", 
            img_id = "sample_id", 
            draw_edges = TRUE, 
            colPairName = "delaunay_interaction_graph", 
            nodes_first = FALSE,
            edge_color_fix = "grey") + 
    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
    ggtitle("delaunay interaction graph")

5.1 Spatial community analysis

The detection of spatial communities was proposed by (Jackson et al. 2020). Here, cells are clustered solely based on their interactions as defined by the spatial object graph. In the following example, we perform spatial community detection separately for tumor and stromal cells.

The general procedure is as follows:
1. subset the object to contain either tumor or stromal cells
2. create an igraph object from the edge list stored in colPair(tumor_spe, "neighborhood")
3. perform community detection using the Louvain algorithm
4. store the community IDs in a vector and replace all communities with a size smaller than 10 by NA

Both tumor and stromal spatial communities are stored in the colData of the SpatialExperiment object.

library(igraph)
set.seed(220819)

# Spatial community detection - tumor
tumor_spe <- spe[,spe$celltype == "Tumor"]
gr <- graph_from_data_frame(as.data.frame(colPair(tumor_spe, "neighborhood")), 
                            directed = FALSE, 
                            vertices = data.frame(index = seq_len(ncol(tumor_spe))))
cl_comm <- cluster_louvain(gr)
comm_tumor <- paste0("Tumor_", membership(cl_comm))
comm_tumor[membership(cl_comm) %in% which(sizes(cl_comm) < 10)] <- NA
names(comm_tumor) <- colnames(tumor_spe)

# Spatial community detection - non-tumor
stroma_spe <- spe[,spe$celltype != "Tumor"]
gr <- graph_from_data_frame(as.data.frame(colPair(stroma_spe, "neighborhood")), 
                            directed = FALSE, 
                            vertices = data.frame(index = seq_len(ncol(stroma_spe))))
cl_comm <- cluster_louvain(gr)
comm_stroma <- paste0("Stroma_", membership(cl_comm))
comm_stroma[membership(cl_comm) %in% which(sizes(cl_comm) < 10)] <- NA
names(comm_stroma) <- colnames(stroma_spe)
comm <- c(comm_tumor, comm_stroma)
spe$spatial_community <- comm[colnames(spe)]

We can now separately visualize the tumor and stromal communities.

plotSpatial(spe[,spe$celltype == "Tumor"], 
            node_color_by = "spatial_community", 
            img_id = "sample_id", 
            node_size_fix = 0.5) +
    theme(legend.position = "none") +
    ggtitle("Spatial tumor communities") +
    scale_color_manual(values = rev(colors()))

plotSpatial(spe[,spe$celltype != "Tumor"], 
            node_color_by = "spatial_community", 
            img_id = "sample_id", 
            node_size_fix = 0.5) +
    theme(legend.position = "none") +
    ggtitle("Spatial non-tumor communities") +
    scale_color_manual(values = rev(colors()))

6 Cellular neighborhood analysis

The following section highlights the use of the imcRtools package to detect cellular neighborhoods. This approach has been proposed by (Goltsev et al. 2018) and (Schürch et al. 2020) to group cells based on information contained in their direct neighborhood.

(Goltsev et al. 2018) perfomed Delaunay triangulation-based graph construction, neighborhood aggregation and then clustered cells. (Schürch et al. 2020) on the other hand constructed a 10-nearest neighbor graph before aggregating information across neighboring cells.

In the following code chunk we will use the 20-nearest neighbor graph as constructed above to define the direct cellular neighborhood. The aggregateNeighbors function allows neighborhood aggregation in 2 different ways:

  1. For each cell the function computes the fraction of cells of a certain type (e.g., cell type) among its neighbors.
  2. For each cell it aggregates (e.g., mean) the expression counts across all neighboring cells.

Based on these measures, cells can now be clustered into cellular neighborhoods. We will first compute the fraction of the different cell types among the 20-nearest neighbors and use kmeans clustering to group cells into 6 cellular neighborhoods.

# By celltypes
spe <- aggregateNeighbors(spe, colPairName = "knn_interaction_graph", 
                          aggregate_by = "metadata", count_by = "celltype")

set.seed(220705)

cn_1 <- kmeans(spe$aggregatedNeighbors, centers = 6)
spe$cn_celltypes <- as.factor(cn_1$cluster)

plotSpatial(spe, 
            node_color_by = "cn_celltypes", 
            img_id = "sample_id", 
            node_size_fix = 0.5) +
    scale_color_brewer(palette = "Set3")

The next code chunk visualizes the cell type compositions of the detected cellular neighborhoods (CN).

library(pheatmap)

for_plot <- prop.table(table(spe$cn_celltypes, spe$celltype), margin = 1)

pheatmap(for_plot, 
         color = colorRampPalette(c("dark blue", "white", "dark red"))(100), 
         scale = "column")

CN 1 and CN 5 are mainly composed of tumor cells with CN 5 forming the tumor/stroma border. CN 3 is mainly composed of B and BnT cells indicating TLS. CN 4 is composed of aggregated plasma cells and most T cells.

7 Spatial context detection

Downstream of CN assignments, we will analyze the spatial context (SC) of each cell using three functions from imcRtools.

While CNs can represent sites of unique local processes, the term SC was coined by Bhate and colleagues (Bhate et al. 2022) and describes tissue regions in which distinct CNs may be interacting. Hence, SCs may be interesting regions of specialized biological events.

Here, we will first detect SCs using the detectSpatialContext function. This function relies on CN fractions for each cell in a spatial interaction graph (originally a KNN graph), which we will calculate using buildSpatialGraph and aggregateNeighbors. We will focus on the CNs derived from cell type fractions but other CN assignments are possible.

Of note, the window size (k for KNN) for buildSpatialGraph should reflect a length scale on which biological signals can be exchanged and depends, among others, on cell density and tissue area. In view of their divergent functionality, we recommend to use a larger window size for SC (interaction between local processes) than for CN (local processes) detection. Since we used a 20-nearest neighbor graph for CN assignment, we will use a 40-nearest neighbor graph for SC detection.

Subsequently, the CN fractions are sorted from high-to-low and the SC of each cell is assigned as the minimal combination of SCs that additively surpass a user-defined threshold. The default threshold of 0.9 aims to represent the dominant CNs, hence the most prevalent signals, in a given window.

For more details and biological validation, please refer to (Bhate et al. 2022).

library(circlize)
library(RColorBrewer)

# Generate k-nearest neighbor graph for SC detection (k=40) 
spe <- buildSpatialGraph(spe, img_id = "sample_id", 
                         type = "knn", 
                         name = "knn_spatialcontext_graph", 
                         k = 40)
# Aggregate based on clustered_neighbors
spe <- aggregateNeighbors(spe, 
                          colPairName = "knn_spatialcontext_graph",
                          aggregate_by = "metadata",
                          count_by = "cn_celltypes",
                          name = "aggregatedNeighborhood")
# Detect spatial contexts
spe <- detectSpatialContext(spe, 
                            entry = "aggregatedNeighborhood",
                            threshold = 0.90,
                            name = "spatial_context")
# Define SC color scheme
col_SC <- setNames(colorRampPalette(brewer.pal(9, "Paired"))(length(unique(spe$spatial_context))), 
                   sort(unique(spe$spatial_context)))

We detect a total of 50 distinct SCs across this dataset.

For ease of interpretation, we will directly compare the CN and SC assignments for Patient3_001.

library(patchwork)

# Compare CN and SC for one patient 
p1 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
            node_color_by = "cn_celltypes", 
            img_id = "sample_id", 
            node_size_fix = 0.5, 
            colPairName = "knn_interaction_graph") +
    scale_color_brewer(palette = "Set3")

p2 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
            node_color_by = "spatial_context", 
            img_id = "sample_id", 
            node_size_fix = 0.5, 
            colPairName = "knn_spatialcontext_graph") +
    scale_color_manual(values = col_SC, limits = force)

p1 + p2

As expected, we can observe that interfaces between different CNs make up distinct SCs. For instance, interface between CN 3 (TLS region consisting of B and BnT cells) and CN 4 (Plasma- and T-cell dominated) turns to SC 3_4. On the other hand, the core of CN 3 becomes SC 3, since for the neighborhood for these cells is just the cellular neighborhood itself.

Next, we filter the SCs based on user-defined thresholds for number of group entries (here at least 3 patients) and/or total number of cells (here minimum of 100 cells) per SC with filterSpatialContext.

# By number of group entries and total number of cells
spe <- filterSpatialContext(spe, 
                            entry = "spatial_context",
                            group_by = "patient_id", 
                            group_threshold = 3,
                            cells_threshold = 100)

Lastly, we can use the plotSpatialContext function to generate SC graphs, analogous to CN combination maps in (Bhate et al. 2022). Returned objects are ggplot objects, which can be easily modified further. We will create a SC graph for the filtered SCs here.

# Colored by name and size by n_cells
plotSpatialContext(spe, 
                   entry = "spatial_context_filtered",
                   group_by = "sample_id",
                   node_color_by = "name",
                   node_size_by = "n_cells",
                   node_label_color_by = "name")

# Colored by n_cells and size by n_group                   
plotSpatialContext(spe, 
                   entry = "spatial_context_filtered",
                   group_by = "sample_id",
                   node_color_by = "n_cells",
                   node_size_by = "n_group",
                   node_label_color_by = "n_cells") +
  scale_color_viridis()

SC 1 (Tumor-dominated), SC 1_5 (Tumor and Tumor-Stroma interface) and SC 2_4 (Plasma/T cell and Myeloid/Neutrophil interface) are the most frequent SCs in this dataset. Moreover, we may compare the degree of the different nodes in the SC graph. For example, we can observe that SC 1 has only one degree (directed to SC 1_5), while SC 5 (Tumor-Stroma) has a much higher degree (n = 5) and potentially more interaction.

8 Patch detection

The previous section focused on detecting cellular neighborhoods in a rather unsupervised fashion. However, the imcRtools package also provides methods for detecting spatial compartments in a supervised fashion. The patchDetection function allows the detection of connected sets of similar cells as proposed by (Hoch et al. 2022). In the following example, we will use the patchDetection function to detect tumor patches in three steps:

  1. Find connected sets of tumor cells (using the steinbock graph).
  2. Components which contain less than 10 cells are excluded.
  3. Expand the components by 1µm to construct a concave hull around the patch and include cells within the patch.
spe <- patchDetection(spe, 
                      patch_cells = spe$celltype == "Tumor",
                      img_id = "sample_id",
                      expand_by = 1,
                      min_patch_size = 10,
                      colPairName = "neighborhood")

plotSpatial(spe, 
            node_color_by = "patch_id", 
            img_id = "sample_id", 
            node_size_fix = 0.5) +
    theme(legend.position = "none") +
    scale_color_manual(values = rev(colors()))

We can now measure the size of each patch using the patchSize function and visualize tumor patch distribution per patient.

patch_size <- patchSize(spe, "patch_id")

patch_size <- merge(patch_size, 
                    colData(spe)[match(patch_size$patch_id, spe$patch_id),], 
                    by = "patch_id")

ggplot(as.data.frame(patch_size)) + 
    geom_boxplot(aes(patient_id, log10(size))) +
    geom_point(aes(patient_id, log10(size)))

The minDistToCells function can be used to calculate the minimum distance between each cell and a cell set of interest. Here, we highlight its use to calculate the minimum distance of all cells to the detected tumor patches. Negative values indicate the minimum distance of each tumor patch cell to a non-tumor patch cell.

spe <- minDistToCells(spe, 
                      x_cells = !is.na(spe$patch_id), 
                      img_id = "sample_id")

plotSpatial(spe, 
            node_color_by = "distToCells", 
            img_id = "sample_id", 
            node_size_fix = 0.5) +
    scale_color_gradient2(low = "dark blue", mid = "white", high = "dark red")

We can now observe th minimum distances to tumor patches in a cell type specific manner.

library(ggridges)
ggplot(as.data.frame(colData(spe))) + 
    geom_density_ridges(aes(distToCells, celltype, fill = celltype)) +
    geom_vline(xintercept = 0, color = "dark red", size = 2) +
    scale_fill_manual(values = metadata(spe)$color_vectors$celltype)

9 Interaction analysis

The next section focuses on statistically testing the pairwise interaction between all cell types of the dataset. For this, the imcRtools package provides the testInteractions function which implements the interaction testing strategy proposed by (Schapiro et al. 2017).

Per grouping level (e.g., image), the testInteractions function computes the averaged cell type/cell type interaction count and compares this count against an empirical null distribution which is generated by permuting all cell labels (while maintaining the tissue structure).

In the following example, we use the steinbock generated spatial interaction graph and estimate the interaction or avoidance between cell types in the dataset.

out <- testInteractions(spe, 
                        group_by = "sample_id",
                        label = "celltype", 
                        colPairName = "neighborhood", 
                        iter = 200)

head(out)
## DataFrame with 6 rows and 10 columns
##       group_by from_label   to_label        ct       p_gt      p_lt interaction
##    <character>   <factor>   <factor> <numeric>  <numeric> <numeric>   <logical>
## 1 Patient1_001      Bcell Bcell              1 0.00995025  1.000000        TRUE
## 2 Patient1_001      Bcell BnTcell            0 1.00000000  1.000000       FALSE
## 3 Patient1_001      Bcell CD4                2 0.00497512  1.000000        TRUE
## 4 Patient1_001      Bcell CD8                0 1.00000000  0.855721       FALSE
## 5 Patient1_001      Bcell Myeloid            0 1.00000000  0.686567       FALSE
## 6 Patient1_001      Bcell Neutrophil        NA         NA        NA          NA
##            p       sig    sigval
##    <numeric> <logical> <numeric>
## 1 0.00995025      TRUE         1
## 2 1.00000000     FALSE         0
## 3 0.00497512      TRUE         1
## 4 0.85572139     FALSE         0
## 5 0.68656716     FALSE         0
## 6         NA        NA        NA

The returned DataFrame contains the test results per grouping level (in this case the image ID, group_by), “from” cell type (from_label) and “to” cell type (to_label). The sigval entry indicates if a pair of cell types is significantly interacting (sigval = 1), if a pair of cell types is significantly avoiding (sigval = -1) or if no significant interaction or avoidance was detected.

These results can be visualized by computing the sum of the sigval entries across all images:

library(tidyverse)
library(scales)
out %>% as_tibble() %>%
    group_by(from_label, to_label) %>%
    summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>%
    ggplot() +
        geom_tile(aes(from_label, to_label, fill = sum_sigval)) +
        scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) +
        theme(axis.text.x = element_text(angle = 45, hjust = 1))

In the plot above the red tiles indicate cell type pairs that were detected to significantly interact on a large number of images. On the other hand, blue tiles show cell type pairs which tend to avoid each other on a large number of images.

Here we can observe that tumor cells are mostly compartmentalized and are in avoidance with other cell types. As expected, B cells interact with BnT cells; regulatory T cells interact with CD4+ T cells and CD8+ T cells. Most cell types show self interactions indicating spatial clustering.

The imcRtools package further implements an interaction testing strategy proposed by (Schulz et al. 2018) where the hypothesis is tested if at least n cells of a certain type are located around a target cell type (from_cell). This type of testing can be performed by selecting method = "patch" and specifying the number of patch cells via the patch_size parameter.

out <- testInteractions(spe, 
                        group_by = "sample_id",
                        label = "celltype", 
                        colPairName = "neighborhood",
                        method = "patch", 
                        patch_size = 3,
                        iter = 200)

out %>% as_tibble() %>%
    group_by(from_label, to_label) %>%
    summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>%
    ggplot() +
        geom_tile(aes(from_label, to_label, fill = sum_sigval)) +
        scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) +
        theme(axis.text.x = element_text(angle = 45, hjust = 1))

These results are comparable to the interaction testing presented above. The main difference comes from the lack of symmetry. We can now for example see that 3 or more myeloid cells sit around CD4\(^+\) T cells while this interaction is not as strong when considering CD4\(^+\) T cells sitting around myeloid cells.

10 Further resources

The IMC data analysis book contains a detailed overview on the presented and other approaches for multiplexed image analysis and visualization.

The steinbock framework provides functionality for image processing.

The ImcSegmentationPipeline provides a GUI-based version of the segmentation pipeline based on Ilastik pixel classification and image segmentation via CellProfiler.

The cytomapper package allows handling and visualization of multi-channel images and segmentation masks directly in R.

The imcRtools package supports reading single-cell data from segmented images, multi-channel spillover correction, and spatial data analysis.

The imcdatasets R/Bioconductor package contains a number of publically available IMC datasets.

For a full overview on the presented approaches, please refer to the preprint:

Jonas Windhager, Bernd Bodenmiller, Nils Eling. An end-to-end workflow for multiplexed image processing and analysis. bioRxiv, 2021

11 Acknowledgments

Jonas Windhager developed the steinbock framework. Vito Zanotelli developed the IMC segmentation pipeline. Daniel Schulz, Tobias Hoch, Lasse Meyer, Jana Fischer, and Vito Zanotelli provided code for the imcRtools package.

Nils Eling is funded by Marie Sklodowska Curie Actions.

Session info

## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] scales_1.2.1                forcats_0.5.2              
##  [3] stringr_1.4.1               dplyr_1.0.10               
##  [5] purrr_0.3.4                 readr_2.1.2                
##  [7] tidyr_1.2.1                 tibble_3.1.8               
##  [9] tidyverse_1.3.2             ggridges_0.5.3             
## [11] patchwork_1.1.2             RColorBrewer_1.1-3         
## [13] circlize_0.4.15             pheatmap_1.0.12            
## [15] igraph_1.3.4                viridis_0.6.2              
## [17] viridisLite_0.4.1           ggplot2_3.3.6              
## [19] imcRtools_1.3.7             SpatialExperiment_1.6.1    
## [21] SingleCellExperiment_1.18.0 SummarizedExperiment_1.26.1
## [23] Biobase_2.56.0              GenomicRanges_1.48.0       
## [25] GenomeInfoDb_1.32.4         IRanges_2.30.1             
## [27] S4Vectors_0.34.0            BiocGenerics_0.42.0        
## [29] MatrixGenerics_1.8.1        matrixStats_0.62.0         
## [31] BiocStyle_2.24.0           
## 
## loaded via a namespace (and not attached):
##   [1] scattermore_0.8             flowWorkspace_4.8.0        
##   [3] R.methodsS3_1.8.2           bit64_4.0.5                
##   [5] knitr_1.40                  irlba_2.3.5                
##   [7] multcomp_1.4-20             DelayedArray_0.22.0        
##   [9] R.utils_2.12.0              data.table_1.14.2          
##  [11] RCurl_1.98-1.8              doParallel_1.0.17          
##  [13] generics_0.1.3              flowCore_2.8.0             
##  [15] ScaledMatrix_1.4.0          terra_1.6-7                
##  [17] cowplot_1.1.1               TH.data_1.1-1              
##  [19] proxy_0.4-27                ggpointdensity_0.1.0       
##  [21] bit_4.0.4                   tzdb_0.3.0                 
##  [23] lubridate_1.8.0             xml2_1.3.3                 
##  [25] httpuv_1.6.6                assertthat_0.2.1           
##  [27] gargle_1.2.1                fontawesome_0.3.0          
##  [29] xfun_0.32                   hms_1.1.2                  
##  [31] jquerylib_0.1.4             evaluate_0.16              
##  [33] promises_1.2.0.1            fansi_1.0.3                
##  [35] readxl_1.4.1                dbplyr_2.2.1               
##  [37] Rgraphviz_2.40.0            DBI_1.1.3                  
##  [39] CATALYST_1.20.1             htmlwidgets_1.5.4          
##  [41] googledrive_2.0.0           ellipsis_0.3.2             
##  [43] ggcyto_1.24.1               ggnewscale_0.4.7           
##  [45] ggpubr_0.4.0                backports_1.4.1            
##  [47] V8_4.2.1                    cytolib_2.8.0              
##  [49] bookdown_0.28               svgPanZoom_0.3.4           
##  [51] RcppParallel_5.1.5          deldir_1.0-6               
##  [53] sparseMatrixStats_1.8.0     vctrs_0.4.1                
##  [55] abind_1.4-5                 cachem_1.0.6               
##  [57] withr_2.5.0                 ggforce_0.3.4              
##  [59] aws.signature_0.6.0         cytomapper_1.9.1           
##  [61] vroom_1.5.7                 svglite_2.1.0              
##  [63] cluster_2.1.4               crayon_1.5.1               
##  [65] drc_3.0-1                   labeling_0.4.2             
##  [67] units_0.8-0                 edgeR_3.38.4               
##  [69] pkgconfig_2.0.3             tweenr_2.0.2               
##  [71] vipor_0.4.5                 rlang_1.0.5                
##  [73] lifecycle_1.0.2             sandwich_3.0-2             
##  [75] modelr_0.1.9                rsvd_1.0.5                 
##  [77] cellranger_1.1.0            polyclip_1.10-0            
##  [79] graph_1.74.0                tiff_0.1-11                
##  [81] Matrix_1.4-1                raster_3.5-29              
##  [83] carData_3.0-5               Rhdf5lib_1.18.2            
##  [85] zoo_1.8-10                  reprex_2.0.2               
##  [87] base64enc_0.1-3             beeswarm_0.4.0             
##  [89] RTriangle_1.6-0.10          googlesheets4_1.0.1        
##  [91] GlobalOptions_0.1.2         png_0.1-7                  
##  [93] rjson_0.2.21                shinydashboard_0.7.2       
##  [95] bitops_1.0-7                R.oo_1.25.0                
##  [97] KernSmooth_2.23-20          ConsensusClusterPlus_1.60.0
##  [99] rhdf5filters_1.8.0          DelayedMatrixStats_1.18.0  
## [101] classInt_0.4-7              shape_1.4.6                
## [103] jpeg_0.1-9                  rstatix_0.7.0              
## [105] ggsignif_0.6.3              aws.s3_0.3.21              
## [107] beachmat_2.12.0             magrittr_2.0.3             
## [109] plyr_1.8.7                  hexbin_1.28.2              
## [111] zlibbioc_1.42.0             compiler_4.2.1             
## [113] dqrng_0.3.0                 concaveman_1.1.0           
## [115] plotrix_3.8-2               clue_0.3-61                
## [117] cli_3.4.0                   XVector_0.36.0             
## [119] ncdfFlow_2.42.1             FlowSOM_2.4.0              
## [121] MASS_7.3-58.1               tidyselect_1.1.2           
## [123] stringi_1.7.8               RProtoBufLib_2.8.0         
## [125] highr_0.9                   yaml_2.3.5                 
## [127] BiocSingular_1.12.0         locfit_1.5-9.6             
## [129] latticeExtra_0.6-30         ggrepel_0.9.1              
## [131] grid_4.2.1                  sass_0.4.2                 
## [133] EBImage_4.38.0              tools_4.2.1                
## [135] parallel_4.2.1              CytoML_2.8.1               
## [137] rstudioapi_0.14             foreach_1.5.2              
## [139] gridExtra_2.3               farver_2.1.1               
## [141] Rtsne_0.16                  ggraph_2.0.6               
## [143] DropletUtils_1.16.0         digest_0.6.29              
## [145] BiocManager_1.30.18         shiny_1.7.2                
## [147] Rcpp_1.0.9                  car_3.1-0                  
## [149] broom_1.0.1                 scuttle_1.6.3              
## [151] later_1.3.0                 httr_1.4.4                 
## [153] sf_1.0-8                    ComplexHeatmap_2.12.1      
## [155] distances_0.1.8             colorspace_2.0-3           
## [157] rvest_1.0.3                 fs_1.5.2                   
## [159] XML_3.99-0.10               splines_4.2.1              
## [161] RBGL_1.72.0                 scater_1.24.0              
## [163] graphlayouts_0.8.1          sp_1.5-0                   
## [165] systemfonts_1.0.4           xtable_1.8-4               
## [167] jsonlite_1.8.0              tidygraph_1.2.2            
## [169] R6_2.5.1                    pillar_1.8.1               
## [171] htmltools_0.5.3             mime_0.12                  
## [173] nnls_1.4                    glue_1.6.2                 
## [175] fastmap_1.1.0               DT_0.24                    
## [177] BiocParallel_1.30.3         BiocNeighbors_1.14.0       
## [179] fftwtools_0.9-11            class_7.3-20               
## [181] codetools_0.2-18            mvtnorm_1.1-3              
## [183] utf8_1.2.2                  lattice_0.20-45            
## [185] bslib_0.4.0                 curl_4.3.2                 
## [187] ggbeeswarm_0.6.0            colorRamps_2.3.1           
## [189] gtools_3.9.3                magick_2.7.3               
## [191] interp_1.1-3                survival_3.4-0             
## [193] limma_3.52.2                rmarkdown_2.16             
## [195] munsell_0.5.0               e1071_1.7-11               
## [197] GetoptLong_1.0.5            rhdf5_2.40.0               
## [199] GenomeInfoDbData_1.2.8      iterators_1.0.14           
## [201] HDF5Array_1.24.2            haven_2.5.1                
## [203] reshape2_1.4.4              gtable_0.3.1

References

Angelo, Michael, Sean C. Bendall, Rachel Finck, Matthew B. Hale, Chuck Hitzman, Alexander D. Borowsky, Richard M. Levenson, et al. 2014. “Multiplexed Ion Beam Imaging of Human Breast Tumors.” Nature Medicine 20 (4): 436–42.
Bhate, Salil S., Graham L. Barlow, Christian M. Schürch, and Garry P. Nolan. 2022. “Tissue Schematics Map the Specialization of Immune Tissue Motifs and Their Appropriation by Tumors.” Cell Systems 13 (2): 109–30.
Eling, Nils, Nicolas Damond, Tobias Hoch, and Bernd Bodenmiller. 2020. “Cytomapper: An r/Bioconductor Package for Visualization of Highly Multiplexed Imaging Data.” Bioinformatics 36 (24): 5706--5708. https://doi.org/10.1093/bioinformatics/btaa1061.
Giesen, Charlotte, Hao A. O. Wang, Denis Schapiro, Nevena Zivanovic, Andrea Jacobs, Bodo Hattendorf, Peter J. Schüffler, et al. 2014. “Highly Multiplexed Imaging of Tumor Tissues with Subcellular Resolution by Mass Cytometry.” Nature Methods 11 (4): 417–22.
Goltsev, Yury, Nikolay Samusik, Julia Kennedy-Darling, Salil Bhate, Matthew Hale, Gustavo Vazquez, Sarah Black, and Garry P. Nolan. 2018. “Deep Profiling of Mouse Splenic Architecture with CODEX Multiplexed Imaging.” Cell 174: 968–81.
Gut, Gabriele, Markus D Herrmann, and Lucas Pelkmans. 2018. “Multiplexed Protein Maps Link Subcellular Organization to Cellular States.” Science 361: 1–13.
Hoch, Tobias, Daniel Schulz, Nils Eling, Julia Martínez Gómez, Mitchell P. Levesque, and Bernd Bodenmiller. 2022. “Multiplexed Imaging Mass Cytometry of the Chemokine Milieus in Melanoma Characterizes Features of the Response to Immunotherapy.” Science Immunology 7 (70): eabk1692. https://doi.org/10.1126/sciimmunol.abk1692.
Jackson, Hartland W., Jana R. Fischer, Vito R. T. Zanotelli, H. Raza Ali, Robert Mechera, Savas D. Soysal, Holger Moch, et al. 2020. “The Single-Cell Pathology Landscape of Breast Cancer.” Nature 578: 615–20.
Lin, Jia-Ren, Benjamin Izar, Shu Wang, Clarence Yapp, Shaolin Mei, Parin M. Shah, Sandro Santagata, and Peter K. Sorger. 2018. “Highly Multiplexed Immunofluorescence Imaging of Human Tissues and Tumors Using t-CyCIF and Conventional Optical Microscopes.” eLife 7: 1–46.
Saka, Sinem K., Yu Wang, Jocelyn Y. Kishi, Allen Zhu, Yitian Zeng, Wenxin Xie, Koray Kirli, et al. 2019. “Immuno-SABER Enables Highly Multiplexed and Amplified Protein Imaging in Tissues.” Nature Biotechnology 37: 1080–90.
Schapiro, Denis, Hartland W Jackson, Swetha Raghuraman, Jana R Fischer, Vito RT Zanotelli, Daniel Schulz, Charlotte Giesen, Raúl Catena, Zsuzsanna Varga, and Bernd Bodenmiller. 2017. “histoCAT: Analysis of Cell Phenotypes and Interactions in Multiplex Image Cytometry Data.” Nature Methods 14: 873--876.
Schulz, Daniel, Vito RT Zanotelli, Rana R Fischer, Denis Schapiro, Stefanie Engler, Xiao-Kang Lun, Hartland W Jackson, and Bernd Bodenmiller. 2018. “Simultaneous Multiplexed Imaging of mRNA and Proteins with Subcellular Resolution in Breast Cancer Tissue Samples by Mass Cytometry.” Cell Systems 6: 25–36.e5.
Schürch, Christian M, Salil S Bhate, Graham L Barlow, Darci J Phillips, Luca Noti, Inti Zlobec, Pauline Chu, et al. 2020. “Coordinated Cellular Neighborhoods Orchestrate Antitumoral Immunity at the Colorectal Cancer Invasive Front.” Cell 182: 1341–59.